Abstract

We prove that approximating the size of stopping and trapping sets in Tanner graphs of linear 
block codes, and more restrictively, the class of low-density parity-check (LDPC) codes, is NP-hard. 
The ramifications of our findings are that methods used for estimating the height of the error-floor of 
moderate- and long-length LDPC codes based on stopping and trapping set enumeration cannot provide 
accurate worst-case performance predictions. 

1 Introduction 

In the past decade, the search for efficient and near-optimal decoding algorithms for linear block codes culminated
with the rediscovery and generalization of the notion of sparse codes and iterative message passing
algorithms. Although Maximum Likelihood (ML) decoding of linear block codes is NP-hard [5], iterative
decoders can approach the Shannon limit of reliable communication with polynomial time complexity, provided
that they operate on codes with long length that have sparse parity-check matrices, also known as
LDPC codes [17]. Decoding is achieved via message passing on the Tanner graph of the code, a suitably
chosen bipartite graphical representation of the code which contains a very small number of edges. On such
graphs, probabilistic inference in the form of iterative message passing is known to have linear complexity
in the code length.

The performance of linear block codes under iterative decoding, and the performance of LDPC codes in 
particular, depends on the structural properties of their chosen Tanner graphs. For each channel-decoder 
pair, there exist vertex configurations in the code graph on which the given iterative decoder fails. For 
some frequently encountered Discrete Memoryless Channels (DMCs), such configurations are known as 
near-codewords [26], trapping and stopping sets [12, 31], pseudocodewords [41, 22], and instantons [36].

It is known that ML decoders fail when transmission errors are confined to Tanner graph configurations 
containing codewords, while iterative decoders usually fail to make correct decisions on (strictly) larger sets 
of configurations. For example, iterative edge-removal (ER) decoders for signalling over the Binary Erasure 
Channel (BEC) fail on stopping sets [12] , a subset of which are the codewords themselves. For the Additive 
White Gaussian Noise (AWGN) channel and sum-product decoding, failures arise due to subsets of vertices 
in the code graph that have similar structural properties as codewords, and are consequently termed
near-codewords [26]. As a result, iterative decoders exhibit sub-optimal performance compared to ML decoders,
and this performance loss most frequently manifests itself in terms of the emergence of error-floors in the 
Bit-Error-Rate (BER) curve of the code. 

The error-floor phenomenon is a problem of focal importance in the theory of iterative decoding, since
many practical applications of codes on graphs require extremely low operational BERs. Since such low BERs 
are well beyond the scope of current Monte-Carlo simulation techniques, several methods were proposed for 
estimating the height of the error-floor through enumerating small stopping and small trapping sets [34], 
and exploring dominant instantons [31, 36]. These techniques operate fairly accurately for codes of very
short and moderate length and small minimum pseudoweight, but they are time consuming, and no rigorous
analytical study of the performance of these search procedures is known.

Figure 1: Examples of Stopping and ZP-Trapping Sets. (a) Stopping Set. (b) ZP Trapping Set.

Figure 2: Examples of AWGN Trapping Sets. (a) AWGN (a, b)-Trapping Set. (b) AWGN Elementary (a, b)-Trapping Set.

Recently, it was shown that the problem of finding the smallest stopping set in an arbitrary code graph 
is NP-hard to approximate within a constant factor [27]. In [3D], it was shown that finding the smallest k-out
set, which represents a straightforward generalization of the notion of a stopping set, is NP-hard as well.
Despite the fact that k-out sets may lead to decoding failures similar to those caused by trapping sets, the
results in I4TJ] do not capture the fact that trapping sets are usually characterized in terms of two parameters.
Furthermore, the notion of a trapping set is meaningful only in conjunction with a fixed decoding method.
Finally, no hardness results for approximating k-out sets or more general trapping sets are currently known.

The main contributions of our work are three-fold. First, we improve upon the hardness results for 
approximating stopping sets, presented in [27] . Furthermore, we introduce the notion of a cover stopping 
set, and show that the problem of finding such a set of smallest cardinality in an arbitrary Tanner graph is 
NP-hard. Second, we provide a set of new results regarding the hardness of finding trapping sets for the Gallager
A (GA) decoder [4], the Zyablov-Pinsker (ZP) decoder [43, 42], and the sum-product decoder. The third,
and most important finding presented in the paper is that these hardness results carry over to the case of 
LDPC code graphs (provided that the notion of "low-density" is properly defined). We discuss the impact 
of these findings on the accuracy of estimating the error-floor based on trapping set enumeration techniques. 
In addition, we give a brief overview of the theory of fixed parameter tractability (FPT), and show that the 
minimum cover stopping set problem is FPT. 

The paper is organized as follows. Section 2 introduces the trapping set structures under investigation,
as well as their corresponding decoding algorithms. Section 3 provides a brief overview of a class of NP-hard
problems that are used in the reduction proofs of our main results. Section 4 contains theorems regarding
the hardness of approximating classes of trapping sets, while Section 5 specializes these results for the class
of sparse code graphs and short code lengths. In Section 6 we briefly comment on the accuracy of error-floor
estimation procedures relying on exhaustive trapping set enumeration techniques. In Section 7 we describe
the notion of fixed parameter tractability and its implications for stopping and trapping set size estimation.
Concluding remarks are given in Section 8.

2 Definitions and Problem Formulation 

A binary, linear $[n, k, d]$ code $C$ is a $k$-dimensional vector subspace of the $n$-dimensional vector space $\mathbb{F}_2^n$. The
generator matrix $M$ of the code $C$ is a $k \times n$ matrix of full row-rank, with rows that correspond to basis
vectors of the subspace. The parity-check matrix $H$ of $C$ is the generator matrix of the null-space of the
code. The matrix $H$ defines a bipartite graph $G = (L \cup R, E)$, with the columns of $H$ indexing the variable
nodes in $L$, and the rows of $H$ indexing the check nodes in $R$. For $i \in L$ and $j \in R$, $(i, j) \in E$ if and only
if $H_{i,j} = 1$. The graph $G$ is called the Tanner graph of $C$ with parity-check matrix $H$. If the parity-check
matrix of a code contains only a "small" number of non-zero entries, i.e., it is sparse, then the corresponding
code is called a Low-Density Parity-Check (LDPC) code. A precise definition of the notion "small" will be
given in Section 5.

For the remaining definitions in this section we need to introduce the following notation.
For $S \subseteq L$, the notation $\Gamma(S)$ is reserved for the set of neighbors of $S$ in $R$. $G_S$ denotes the subgraph induced
by $S \subseteq L$, which is defined as the graph on nodes $S \cup \Gamma(S)$ with edges $\{(u, v) \in E : u \in S, v \in \Gamma(S)\}$. Equivalently,
$G_S$ is the Tanner graph of the punctured parity-check matrix of the code, consisting of the columns indexed
by $S$. For any graph $G'$, $V(G')$ denotes the set of nodes of $G'$ and $E(G')$ denotes the set of its edges.

Iterative decoders are a class of inference algorithms that operate on Tanner graphs of codes. These 
decoders are known to compute the maximum likelihood estimates of variables only on Tanner graphs free
of cycles. Nevertheless, when applied to LDPC codes that contain cycles, they can approach the Shannon 
limit on optimal performance with complexity linear in the length of the code. 

The messages passed between vertices of the Tanner graphs during iterative decoding depend on the 
characteristics of the transmission channel, and there usually exist many different iterative decoding methods 
that can be used for the same channel. For various decoder architectures specialized for the BEC, BSC, 
and AWGN channel, the interested reader is referred to [35]. For clarity of the future exposition, we briefly
describe three of these procedures: the edge-removal (ER) algorithm, the Zyablov-Pinsker (ZP) bit-flipping
method [15, 25, 42], and the regular Gallager A algorithm [7]. The first algorithm operates on outputs of the
BEC, while the latter two are designed for the BSC. A detailed description of different decoding procedures
for signalling over the AWGN channel can be found in [51].

The ER algorithm is used for codes transmitted over the BEC, where the input to the channel
is a vector $c_1 c_2 \cdots c_n \in C$, and the output is a vector $v_1 v_2 \cdots v_n$ over the symbol alphabet $\{0, 1, e\}$. For a
BEC with erasure probability $p$, one has $\Pr[v_i = c_i] = 1 - p$ and $\Pr[v_i = e] = p$. The ER algorithm
assigns to each vertex $i$ in $L$ of the Tanner graph of $C$ the symbol $v_i$. It then iteratively searches for vertices
in $R$ adjacent to exactly one $e$ symbol in $L$. Due to the even-parity restriction, the correct value for
such a symbol can be uniquely determined. The decoder terminates either when the correct codeword is
recovered or when every parity-check vertex connected to an $e$ symbol is connected to at least two such
symbols. In the latter case, we say that the decoder failed on a stopping set.
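The edge-removal procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `peel_erasures`, the representation of $H$ as a list of 0/1 rows, and the use of `None` for the erasure symbol $e$ are all assumptions made here.

```python
def peel_erasures(H, received):
    """Edge-removal (peeling) decoding over the BEC (illustrative sketch).

    H: parity-check matrix as a list of 0/1 rows (checks x variables).
    received: list with entries 0, 1, or None (None marks an erasure).
    Returns (word, erased), where erased holds the positions left unresolved.
    """
    word = list(received)
    erased = {i for i, v in enumerate(word) if v is None}
    progress = True
    while erased and progress:
        progress = False
        for row in H:
            unknown = [i for i, h in enumerate(row) if h and i in erased]
            if len(unknown) == 1:
                # Even parity determines the single erased symbol of this check.
                i = unknown[0]
                word[i] = sum(word[j] for j, h in enumerate(row) if h and j != i) % 2
                erased.discard(i)
                progress = True
    return word, erased


# A parity-check matrix of the [7, 4] Hamming code (hypothetical test input).
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
print(peel_erasures(H, [None, 1, 1, None, 0, 0, 0]))
```

If the decoder returns a non-empty `erased` set, every check adjacent to that set sees at least two erasures, which is exactly the failure condition stated above.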

Definition 1 (BEC Stopping Sets). Given a bipartite graph $G = (L \cup R, E)$, we say that $S \subseteq L$ is a
stopping set if the degree of each vertex in $\Gamma(S)$ in the induced subgraph $G_S$ is at least two.

Of independent interest is the problem of determining the size of the smallest stopping set $S$ such that
$\Gamma(S) = R$, i.e., the smallest set of vertices that covers each check node in $R$ at least twice. We refer to such a
set as a cover stopping set. If the symbols corresponding to a cover stopping set are erased, then the decoding
process terminates before completing the first iteration, and no erasure can be corrected.
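As a concrete reading of Definition 1 and of the cover stopping set, the following minimal Python sketch checks both properties for a candidate set. The helper names and the list-of-rows encoding of $H$ are assumptions of this sketch, as is the convention that the empty set is not counted as a stopping set.

```python
def check_degrees(H, S):
    """Degree in G_S of every check node adjacent to S (rows of H are checks)."""
    degrees = {}
    for r, row in enumerate(H):
        d = sum(row[i] for i in S)
        if d > 0:
            degrees[r] = d
    return degrees


def is_stopping_set(H, S):
    """True if S is non-empty and every check in Gamma(S) has degree >= 2."""
    return len(S) > 0 and all(d >= 2 for d in check_degrees(H, S).values())


def is_cover_stopping_set(H, S):
    """True if S is a stopping set whose neighborhood is all of R."""
    return is_stopping_set(H, S) and len(check_degrees(H, S)) == len(H)


H = [[1, 1, 0, 0],
     [0, 0, 1, 1]]
print(is_stopping_set(H, {0, 1}), is_cover_stopping_set(H, {0, 1}))
```

In this toy graph the set `{0, 1}` touches only the first check, so it is a stopping set without being a cover stopping set.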

Assume next that the Tanner graph of $C$ is left-regular, with degree $\ell$. For a BSC with error
probability $p$, the received word $v_1 v_2 \cdots v_n \in \{0, 1\}^n$ satisfies $\Pr[v_i = c_i] = 1 - p$ and $\Pr[v_i = \bar{c}_i] = p$. In the first iteration
of ZP decoding, the decoder scans for received symbols $v_i$ that are connected to $\ell$ unsatisfied parity-check
equations. If symbols with such a property are encountered, the decoder flips their values sequentially. The
procedure is repeated for vertices with $\ell - 1, \ell - 2, \ldots, \ell - \lfloor (\ell - 1)/2 \rfloor$ unsatisfied check equations. The decoder
terminates by either recovering the correct codeword or by encountering a word for which each symbol is
included in fewer than $\ell - \lfloor (\ell - 1)/2 \rfloor$ unsatisfied check equations. In the latter case, we say that the decoder
failed on a ZP trapping set.

Definition 2 (BSC ZP-Trapping Sets). Let $G = (L \cup R, E)$ be a left-regular bipartite graph with degree
$\ell$. We say that $S \subseteq L$ is a ZP-trapping set if the induced subgraph $G_S$ is such that every vertex in $S$ is
connected to fewer than $\ell - \lfloor (\ell - 1)/2 \rfloor$ odd-degree vertices in $G_S$.
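The condition in Definition 2 can be tested directly. The sketch below is an illustration under stated assumptions: $H$ is given as a list of 0/1 rows, the function name is hypothetical, and left-regularity is verified rather than assumed.

```python
def is_zp_trapping_set(H, S):
    """Check Definition 2 for a left-regular Tanner graph given by H."""
    n = len(H[0])
    column_degrees = {sum(row[i] for row in H) for i in range(n)}
    if len(column_degrees) != 1:
        raise ValueError("the Tanner graph must be left-regular")
    ell = column_degrees.pop()
    threshold = ell - (ell - 1) // 2
    # odd[r] is True when check r has odd degree in the induced subgraph G_S.
    odd = [sum(row[i] for i in S) % 2 == 1 for row in H]
    return all(
        sum(1 for r, row in enumerate(H) if row[i] and odd[r]) < threshold
        for i in S
    )


# A 3-cycle of variable nodes, each of degree 2 (hypothetical example).
H = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
print(is_zp_trapping_set(H, {0}), is_zp_trapping_set(H, {0, 1}))
```

Here $\ell = 2$, so the threshold is 2: the single node `{0}` sees two odd-degree checks and fails the test, while each node of `{0, 1}` sees only one.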

Another frequently used iterative decoding algorithm for signaling over the BSC that has a complete
characterization of trapping sets is the Gallager A algorithm for regular codes with left vertex degree $\ell = 3$.
The decoding rule is straightforward: unless all incoming messages to a variable node are identical, the
variable node transmits its received symbol. Otherwise, the node transmits the consensus vote. The
check nodes, in turn, pass on their parity estimates to their neighboring variable nodes.

Figure 3: Example of Majority Trapping Set.

Definition 3 (BSC GA-Trapping Sets). Let $G = (L \cup R, E)$ be a bipartite graph with left-degree three,
such that all vertices in $R$ have degree $r > 3$. Let $T \subseteq L$ and let $G_T$ be the subgraph of $G$ induced by $T$.
Let $O = \{v \in \Gamma(T) : \deg_{G_T}(v) \text{ odd}\}$. We say $T$ is a GA-trapping set with parameter $a$ if $|O| = a$, if
$|\Gamma(u) \cap O| \le 1$ for each $u \in T$, and if no two checks in $O$ have a common neighbor in $L \setminus T$.

For the AWGN channel and message-passing algorithms, no precise analytic characterization of failing
configurations is known. Extensive computer simulations [31, 24] show that errors are usually confined
to near-codewords, also known as trapping sets or instantons. Roughly speaking, trapping sets resemble
codewords insofar as they result in a very small number of unsatisfied check equations (for codewords,
this number equals zero). We focus our attention on three such configurations, defined below.

Definition 4 (AWGN $(a, b)$-Trapping Sets). Given a bipartite graph $G = (L \cup R, E)$, we say that $S \subseteq L$
is an $(a, b)$-trapping set if $|S| = a$ and the induced subgraph $G_S$ is such that $\Gamma(S)$ has exactly $b$ vertices of odd
degree. Similarly, we say that $S \subseteq L$ is an elementary $(a, b)$-trapping set if $b$ vertices in $\Gamma(S)$ have degree
one, and $|\Gamma(S)| - b$ vertices have degree two.
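To make Definition 4 concrete, the following Python sketch computes the parameters $(a, b)$ of a given set and tests whether it is elementary. The function names and the list-of-rows encoding of $H$ are assumptions of this sketch.

```python
def trapping_parameters(H, S):
    """Return (a, b): a = |S|, b = number of odd-degree checks in G_S."""
    degrees = [sum(row[i] for i in S) for row in H]
    b = sum(1 for d in degrees if d % 2 == 1)
    return len(S), b


def is_elementary(H, S):
    """True if every check adjacent to S has degree one or two in G_S."""
    degrees = [sum(row[i] for i in S) for row in H]
    return all(d <= 2 for d in degrees)


# A parity-check matrix of the [7, 4] Hamming code (hypothetical test input).
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
print(trapping_parameters(H, {0, 1, 3}), is_elementary(H, {0, 1, 3}))
```

For the set `{0, 1, 3}` the first check has degree three, so the set is a $(3, 1)$-trapping set that is not elementary, whereas the codeword support `{0, 1, 2}` gives $b = 0$.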

Definition 5 (AWGN Majority Trapping Set). Given a bipartite graph $G = (L \cup R, E)$, we say $S \subseteq L$ is
good if the induced subgraph $G_S$ is such that the majority of vertices of $G_S$ in $\Gamma(S)$ have even degree. A set $T \subseteq L$ is
a majority trapping set if $T$ and $L \setminus T$ are both good.

Examples of Tanner graphs including stopping sets, ZP-trapping sets, as well as AWGN trapping sets
are shown in Figures 1, 2, and 3, respectively. Circles denote variable nodes in $L$, while squares denote check
nodes in $R$ of the Tanner graph $G = (L \cup R, E)$.

Complexity Theory: A problem belongs to the class NP if it can be solved in polynomial time by a non- 
deterministic Turing machine. Alternatively, the complexity category of decision problems for which answers 
can be checked for correctness using a certificate and an algorithm with polynomial running time in the size 
of the input is known as the NP class. A problem is NP-hard if the existence of a deterministic polynomial 
time algorithm for the problem would imply the existence of deterministic polynomial time algorithms for 
every problem in NP. This consequence is widely believed to be false, and hence determining that a problem
is NP-hard is a very strong indicator that the problem is computationally intractable, i.e., that no deterministic,
polynomial-time algorithm exists for the problem.

For optimization problems, there exists a large body of work that considers approximate solutions rather 
than exact solutions [33]. When minimizing a function subject to constraints, we say an algorithm is an
$\alpha$-approximation algorithm if it always returns a solution whose value is at most a factor $\alpha$ greater than the
value for the optimal solution. For some NP-hard problems, it is possible to show that it is also NP-hard
to $\alpha$-approximate the problem. For a more thorough treatment of these and other subjects in complexity
theory, the interested reader is referred to [20] . 
Our Results: We are concerned with the worst-case computational complexity of the following problems.

1. MinStop: Find a stopping set of minimum cardinality.

2. MinCStop: Find a cover stopping set of minimum cardinality.

3. MinTrapZP: Find a ZP-trapping set of minimum cardinality.

4. MinTrapGA: Given $a$, find a GA-trapping set of minimum cardinality.

5. MinTrapAWGN: Given $a$, find an $(a, b)$-trapping set with minimum parameter $b$.

6. MinTrapAWGN-elem: Given $b$, find an elementary $(a, b)$-trapping set with minimum parameter $a$.

7. MinTrapAWGN-maj: Find a majority trapping set of minimum cardinality.

We show that there are no polynomial time algorithms for any of the above problems under standard 
complexity assumptions. Furthermore, there are no polynomial time algorithms that even approximate the 
optimal solutions to a guaranteed precision. Many of these hardness results also apply when we restrict our 
attention to Tanner graphs that correspond to LDPC codes. Our proofs can all be cast as reductions from the 
NP-hard Minimum Set Cover, Minimum Distance [37, 38, 15], and Maximum Three-Dimensional Matching
problems [5]. These, and some other relevant problems subsequently referred to, are briefly described in the
following section. 

3 A Class of NP-hard problems 

For completeness, we provide known NP-hardness and approximation results for a class of combinatorial
optimization problems that will be used in the proofs of the subsequent sections. Most of the results presented
in this section are available at [21].

1. The Minimum Set Cover Problem, MinSetCov: Given a collection $\mathcal{S} = \{S_1, \ldots, S_a\}$ of subsets of $[b]$,
find $\mathcal{S}' \subseteq \mathcal{S}$ of minimum cardinality such that $\bigcup_{S \in \mathcal{S}'} S = [b]$. It is NP-hard to $(c \log N)$-approximate
MinSetCov [30] for some $c > 0$, where $N$ is the description length of the problem. Even in the case that
$|S_i \cap S_j| \le 1$ for $1 \le i < j \le a$, it can be shown that there exists no polynomial-time $(c \log N)$-approximation
algorithm unless NP $\subseteq$ ZTIME$(N^{O(\log \log N)})$ [53], where ZTIME$(t)$ denotes the class
of problems that have a probabilistic algorithm with expected running time $t$ and with zero error
probability.

2. The Minimum Hitting Set Problem, MinHitSet: Given a collection $\mathcal{S} = \{S_1, \ldots, S_b\}$ of subsets of $[a]$,
find a set $S'$ of smallest cardinality such that $|S' \cap S_i| \ge 1$ for all $i = 1, 2, \ldots, b$. The MinHitSet
problem is equivalent to the MinSetCov problem [5], and as a consequence it is also NP-hard to
$(c \log N)$-approximate MinHitSet [30] for some $c > 0$. In the case when $|S_i| = 2$ for all $i \in [b]$, the
problem is often called the vertex cover problem, MinVertCov. The vertex cover problem, even when
we have $|\{i : j \in S_i\}| \le 3$, is NP-hard to approximate up to some constant $\alpha > 1$.

3. The Maximum Three-Dimensional Matching Problem, MaxThreeDimMatch: Given a set
$T \subseteq X \times X \times X$, determine if a set $S \subseteq T$ of size $|X|$ exists such that no two elements in $S$ agree in any
coordinate. This decision problem is NP-hard even if no element of $X$ appears more than 3 times in
the same coordinate of sets from $T$ [5D].

4. The Maximum Likelihood Decoding Problem, MaxLikeDecode: Given a code $C$ specified by
an $m \times n$ parity-check matrix $H$ (we may assume $H$ has linearly independent rows), a vector $s \in \mathbb{F}_2^m$,
and an integer $w > 0$, determine if there is a vector $x \in \mathbb{F}_2^n$ with weight bounded from above by $w$ and
such that $H x^T = s$. The MaxLikeDecode problem is NP-hard to approximate within any constant
factor [1].

5. The Minimum Weight Codeword Problem, MinCodeword: Given a code $C$ specified by a
$k \times n$ generator matrix $M$ of full row-rank, find the smallest weight of a non-zero codeword. The
MinCodeword problem is not approximable within any constant factor unless NP $\subseteq$ RP, where RP
is the set of decision problems for which there exists a randomized algorithm that is always correct on
no instances and correct with probability at least 1/2 on yes instances.

4 Hardness of Approximation Results 

4.1 Hardness of Approximation for MinStop 

We start by showing that MinStop is not approximable within $o(\log N)$, where $N$ denotes the description
length of the problem, unless P = NP. This result improves upon the finding in [27], where the weaker
claim that MinStop cannot be approximated within any positive constant was proved. This improvement
is a consequence of the fact that our proof relies on a reduction from MinSetCov, rather than from the
MinVertCov problem [27].

Theorem 1. There exists a constant $c > 0$ such that it is NP-hard to $(c \log N)$-approximate MinStop.

Proof. The proof is by a reduction from MinSetCov. Let $b = |\bigcup_{i \in [a]} S_i|$, and without loss of generality,
assume that $S \subseteq [b]$ for each $S \in \mathcal{S}$. Form a bipartite graph $G = (L \cup R, E)$ with $L = \{u_1, \ldots, u_a, x, y\}$,
$R = \{v_1, \ldots, v_b, w_1, \ldots, w_a, z\}$, and edges

$$E = \{(u_i, v_j) : j \in S_i\} \cup \{(u_i, w_i) : i \in [a]\} \cup \{(x, v) : v \in R\} \cup \{(y, v) : v \in \{w_1, \ldots, w_a, z\}\}.$$

An illustration of this graphical structure is given in Figure 4.

We show that $G$ has stopping distance $t + 2$ if and only if the minimum set cover is of size $t$. Since there
is no polynomial-time algorithm returning a $(c \log N)$-approximation for MinSetCov unless P = NP (for some
sufficiently small $c > 0$), this establishes the theorem.

Let $S$ be a stopping set. Consequently,

1. If ($x \in S$ or $y \in S$), then ($x \in S$ and $y \in S$), since otherwise $d_{G_S}(z) = 1$.

2. If $x \in S$, then $u_i \in S$ for some $i$, since otherwise $d_{G_S}(v_j) = 1$ for some $j \in [b]$.

3. If $u_i \in S$, then ($x \in S$ or $y \in S$), since otherwise $d_{G_S}(w_i) = 1$.

Therefore, if $S$ is non-empty, then $x, y, u_i \in S$ for some $i \in [a]$. But then $d_{G_S}(v_j) \ge 2$ for $j \in S_i$. Moreover, since
$x \in S$ is adjacent to every $v_j$, each $v_j$ must have degree at least two in $G_S$, which
means that for all $j \in [b]$, $d_{G_{S \setminus \{x, y\}}}(v_j) \ge 1$. Therefore, $S$ being a stopping set implies that the included $u_i$
nodes correspond to a covering of $[b]$. Conversely, the nodes corresponding to a covering of $[b]$, together with $x$ and $y$,
form a stopping set, since every node on the right-hand side ($R$) is in the neighborhood and has degree at
least two. Hence the size of the minimum stopping set of $G$ is exactly 2 plus the size of the minimum set
cover. □
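The reduction in the proof can be checked numerically on a small instance. The Python sketch below builds the graph $G$ from a set cover instance and compares, by brute force, the minimum stopping set size with the minimum set cover size plus two. The function names, the ordering of rows and columns, and the example instance are assumptions made for illustration; the exhaustive search is only feasible for tiny inputs.

```python
from itertools import combinations


def set_cover_to_tanner(sets, b):
    """Rows: v_1..v_b, w_1..w_a, z. Columns: u_1..u_a, x, y."""
    a = len(sets)
    H = [[0] * (a + 2) for _ in range(b + a + 1)]
    x, y = a, a + 1
    for i, S_i in enumerate(sets):
        for j in S_i:              # edge (u_i, v_j); elements are 1..b
            H[j - 1][i] = 1
        H[b + i][i] = 1            # edge (u_i, w_i)
    for r in range(b + a + 1):     # x is adjacent to every check node
        H[r][x] = 1
    for r in range(b, b + a + 1):  # y is adjacent to w_1..w_a and z
        H[r][y] = 1
    return H


def min_stopping_set_size(H):
    """Brute-force stopping distance: no check may have degree exactly one."""
    n = len(H[0])
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            if all(sum(row[i] for i in S) != 1 for row in H):
                return size
    return None


def min_set_cover_size(sets, b):
    """Brute-force minimum set cover of {1, ..., b}."""
    universe = set(range(1, b + 1))
    for size in range(1, len(sets) + 1):
        for cover in combinations(sets, size):
            if set().union(*cover) == universe:
                return size
    return None


sets = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
H = set_cover_to_tanner(sets, 4)
print(min_stopping_set_size(H), min_set_cover_size(sets, 4) + 2)
```

On this instance both printed numbers agree, matching the claim that the stopping distance equals the minimum cover size plus two.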

MinCStop: The proof of Theorem 1 also implies that there exists a $c > 0$ such that it is NP-hard
to $(c \log n)$-approximate MinCStop. This is a consequence of the fact that, in the family of hard instances
considered, the neighborhood of every stopping set consists of all the check nodes. We next
show that there exists a deterministic, polynomial-time, $O(\log n)$-approximation algorithm for MinCStop.
This follows because we can relate MinCStop to MinHitSet as follows.

For each $r \in R$, create a set of sets $\mathcal{S}_r$ that consists of all $(|\Gamma(r)| - 1)$-subsets of $\Gamma(r)$. For example, if
$\Gamma(r) = \{a, b, c, d\}$, then $\mathcal{S}_r = \{\{a, b, c\}, \{a, b, d\}, \{a, c, d\}, \{b, c, d\}\}$. Let $\mathcal{S} = \bigcup_{r \in R} \mathcal{S}_r$. Then $Q \subseteq L$ is a
hitting set for $\mathcal{S}$ iff it is a cover stopping set of $G$. This claim can be proved in a straightforward manner: if
$Q$ contained at most one element of $\Gamma(r)$, say $a$, then the $(|\Gamma(r)| - 1)$-subset that does not contain $a$ would not
be hit, so $Q$ must contain at least two elements of each $\Gamma(r)$. Conversely, any two elements of $\Gamma(r)$ hit every
$(|\Gamma(r)| - 1)$-subset of $\Gamma(r)$.
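The equivalence between hitting sets of the constructed family and cover stopping sets can be verified exhaustively on a small Tanner graph. The following sketch is illustrative only; the function names and the matrix encoding are assumptions, and a check node of degree less than two is handled by adding the empty set to the family, which can never be hit.

```python
from itertools import combinations


def hitting_family(H):
    """All (|Gamma(r)| - 1)-subsets of Gamma(r), over every check node r."""
    family = set()
    for row in H:
        nbrs = [i for i, h in enumerate(row) if h]
        if len(nbrs) < 2:
            family.add(frozenset())  # such a check can never be covered twice
            continue
        for subset in combinations(nbrs, len(nbrs) - 1):
            family.add(frozenset(subset))
    return family


def is_hitting_set(Q, family):
    """True if Q intersects every member of the family."""
    return all(Q & T for T in family)


def is_cover_stopping_set(H, Q):
    """True if every check node has at least two neighbors in Q."""
    return all(sum(row[i] for i in Q) >= 2 for row in H)


H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
family = hitting_family(H)
agree = all(
    is_hitting_set(frozenset(Q), family) == is_cover_stopping_set(H, Q)
    for size in range(8)
    for Q in combinations(range(7), size)
)
print(agree)
```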
Figure 4: Reduction from MinSetCov to MinStop.

Consequently, any $\alpha$-approximation algorithm for MinHitSet can also be used to obtain an $\alpha$-approximation
algorithm for MinCStop. For example, the following simple greedy algorithm can be shown to be an $O(\log n)$-approximation
algorithm for MinHitSet: at each step, add the element that appears in the most sets from
$\mathcal{S}$, remove these sets from $\mathcal{S}$, and repeat until every set from $\mathcal{S}$ contains a chosen element.

The greedy algorithm searches for cover stopping sets by going through the list of variable nodes in
decreasing order of their degree, and it is straightforward to see that the algorithm terminates after at most
$(n - k)\,\delta_{\max}$ steps, where $\delta_{\max}$ denotes the largest degree of any check node in the Tanner graph of the
code. As a consequence, this algorithm is especially well suited for LDPC codes, to be formally defined in
Section 5.
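A minimal Python sketch of the greedy heuristic, applied to the hitting set family built from the check-node neighborhoods, is given below. It is an illustration rather than the authors' code: the function name, the matrix encoding, and the tie-breaking rule (smallest index first) are assumptions, and the routine returns `None` when some check node has fewer than two neighbors, since no cover stopping set exists in that case.

```python
from itertools import combinations


def greedy_cover_stopping_set(H):
    """Greedy hitting-set heuristic for a cover stopping set of H."""
    family = set()
    for row in H:
        nbrs = [i for i, h in enumerate(row) if h]
        if len(nbrs) < 2:
            return None  # this check can never reach degree two
        family.update(frozenset(c) for c in combinations(nbrs, len(nbrs) - 1))
    chosen = set()
    while family:
        # Count how many remaining sets each variable node would hit.
        counts = {}
        for T in family:
            for i in T:
                counts[i] = counts.get(i, 0) + 1
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        family = {T for T in family if best not in T}
    return chosen


H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
Q = greedy_cover_stopping_set(H)
print(sorted(Q), all(sum(row[i] for i in Q) >= 2 for row in H))
```

The returned set always covers every check node at least twice, but, being a greedy heuristic, it is not guaranteed to be of minimum size.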

Hardness under Stronger Assumptions: Under the assumption that NP $\not\subseteq$ DTIME$(N^{\mathrm{polylog}\, N})$,
it was shown in [27] that there exists no polynomial-time approximation algorithm for MinStop within
$2^{(\log N)^{1 - \epsilon}}$, for any $\epsilon > 0$.

4.2 Hardness of Approximation for MinTrapZP, MinTrapGA, and MinTrapAWGN

We show next that the problems MinTrapZP, MinTrapGA, and MinTrapAWGN are computationally at
least as hard as the MinCodeword problem.

Theorem 2. For any constant $\alpha$, there is no polynomial-time $\alpha$-approximation algorithm for MinTrapZP,
unless RP = NP.

Proof. Recall that unless RP = NP, the MinCodeword problem cannot be approximated in polynomial time
within any constant factor, even under the restriction that the Tanner graph of the code is left regular. This follows
directly from the results in [15].

Given a Tanner graph $G = (L \cup R, E)$ that is left regular, say with degree $\lfloor (\ell - 1)/2 \rfloor + 1$, for each
node $u \in L$ create $\ell - \lfloor (\ell - 1)/2 \rfloor - 1$ new nodes in $R$, each connected only to $u$. Call the new Tanner graph
$G'$. Each new check node has odd degree in the induced subgraph whenever its neighbor lies in $S$, so a vertex of $S$
satisfies the ZP condition only if all of its original check nodes have even degree. Hence any $S \subseteq L$ is a ZP-trapping
set in $G'$ iff $S$ is the support of a codeword in $G$. It follows that any $\alpha$-approximation
algorithm for MinTrapZP yields an $\alpha$-approximation algorithm for MinCodeword, and the
result follows. □
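The construction in the proof can be sanity-checked on a toy example. The sketch below assumes, purely for illustration, the choice $\ell = 2d - 1$ for a left-regular graph of degree $d$ (the proof does not fix $\ell$; this is one value satisfying $\lfloor (\ell - 1)/2 \rfloor + 1 = d$). Function names and the matrix encoding are likewise hypothetical.

```python
from itertools import combinations


def augment_for_zp(H):
    """Add pendant checks so that the left degree becomes ell = 2d - 1."""
    n = len(H[0])
    d = sum(row[0] for row in H)
    assert all(sum(row[i] for row in H) == d for i in range(n))
    ell = 2 * d - 1                     # assumed choice of ell
    extra = ell - (ell - 1) // 2 - 1    # new checks per variable node
    rows = [list(row) for row in H]
    for i in range(n):
        for _ in range(extra):
            rows.append([1 if j == i else 0 for j in range(n)])
    return rows, ell


def is_zp_trapping(H, ell, S):
    """Definition 2 for a graph with left degree ell."""
    odd = [sum(row[i] for i in S) % 2 for row in H]
    threshold = ell - (ell - 1) // 2
    return all(sum(o for o, row in zip(odd, H) if row[i]) < threshold for i in S)


def is_codeword_support(H, S):
    """True if the indicator vector of S satisfies every check of H."""
    return all(sum(row[i] for i in S) % 2 == 0 for row in H)


H = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
H_aug, ell = augment_for_zp(H)
for size in range(1, 4):
    for S in combinations(range(3), size):
        print(S, is_zp_trapping(H_aug, ell, S), is_codeword_support(H, S))
```

On this example the two predicates agree for every non-empty set, as the proof asserts.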

A very similar argument can be used to prove the following claim. 

Theorem 3. For any constant $\alpha$, there is no polynomial-time $\alpha$-approximation algorithm for MinTrapGA,
unless RP = NP.

Proof. As in the proof of Theorem 2, create for each node $u \in L$ one new node in $R$, connected
only to $u$. Call the new Tanner graph $G'$. Then any $S \subseteq L$ is a GA-trapping set in $G'$ iff $S$ is the support of
a codeword in $G$. This follows due to the fact that the first condition in the definition of GA-trapping sets is
identical to the ZP-restriction, with $\ell = 3$. The second condition in the definition of a GA-trapping set is
enforced automatically, since vertices in $L \setminus S$ cannot be connected to odd-degree check nodes in $G_S$ due to
the fact that all such checks have degree one. Hence any $\alpha$-approximation algorithm for MinTrapGA yields
an $\alpha$-approximation algorithm for MinCodeword, and the result follows.

□ 

Theorem 4. For any constant $\alpha$, there is no polynomial-time $\alpha$-approximation algorithm for MinTrapAWGN,
unless RP = NP.

Proof. The proof is by a reduction from MinCodeword, and follows along similar lines as the proofs of
the above theorems. To this end, we construct the Tanner graph $(L \cup R, E)$ of the dual code of $C$, where
$L = \{u_1, \ldots, u_k\}$, $R = \{v_1, \ldots, v_n\}$, and $E = \{(u_i, v_j) : M_{i,j} = 1\}$, where $M$ denotes a generator matrix of the code
of full row-rank. Note that for each $S \subseteq L$, the odd-degree vertices of $\Gamma(S)$ in $G_S$ form the support of a codeword,
namely the sum of the rows of $M$ indexed by $S$. Hence, if we have an $\alpha$-approximation algorithm
for the minimum trapping set problem for any $\alpha$, then this gives an $\alpha$-approximation algorithm for the minimum
weight codeword problem by running through all values of $a$ and taking the minimum of the resulting $b$'s.
But, since it is impossible to $O(1)$-approximate MinCodeword in polynomial time unless RP = NP [15],
it is impossible to $O(1)$-approximate MinTrapAWGN in polynomial time unless RP = NP. □

4.3 Hardness of Approximation for MinTrapAWGN-elem

Theorem 5. For any $\alpha$, it is NP-hard to $\alpha$-approximate MinTrapAWGN-elem.

Proof. The proof is based on showing that a polynomial-time algorithm for solving MinTrapAWGN-elem can
be used for solving the MaxThreeDimMatch problem, and is based on similar arguments as those used
for showing that MaxLikeDecode is NP-complete [5]. To this end, let us construct the matching incidence
matrix $D$ as follows. Let the collection of ordered triples be $T \subseteq X \times X \times X$, where $|T| = t$ and $|X| = n$,
and let $(x_j, y_j, z_j)$ denote the $j$-th triple. Then $D$ is a $3n \times t$ zero-one matrix, with entries

$$
\begin{aligned}
1 \le i \le n: &\quad D_{i,j} = 1 \text{ iff } x_j = i, \\
n < i \le 2n: &\quad D_{i,j} = 1 \text{ iff } y_j = i - n, \\
2n < i \le 3n: &\quad D_{i,j} = 1 \text{ iff } z_j = i - 2n.
\end{aligned}
$$



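As an example, the matrix $D$ for the set of triples

$$\{(1, 2, 2), (3, 2, 1), (2, 3, 1), (1, 2, 3), (2, 3, 3), (3, 1, 3)\}$$

over $X = \{1, 2, 3\}$ has the form

$$
D = \begin{pmatrix}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1
\end{pmatrix}.
$$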



The set of triples {(1, 2, 2), (2, 3, 1), (3, 1, 3)} is a maximum three-dimensional matching over the set {1, 2, 3}. 
Observe that all rows in the sub-matrix of D induced by the three columns corresponding to these triples 
have Hamming weight one. This is a consequence of the defining constraint of the MaxThreeDimMatch 
problem that asserts that every element in X appears at a given position of the matching exactly once. 
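The construction of $D$ and the weight-one observation can be reproduced with a short Python sketch. The function names are hypothetical, and the code assumes that the elements of $X$ are labeled $1, \ldots, n$ and that the columns follow the order in which the triples are listed.

```python
def matching_incidence_matrix(triples, n):
    """Build the 3n x t zero-one matrix D for triples over {1, ..., n}."""
    D = [[0] * len(triples) for _ in range(3 * n)]
    for j, (x, y, z) in enumerate(triples):
        D[x - 1][j] = 1            # first block of rows: first coordinate
        D[n + y - 1][j] = 1        # second block: second coordinate
        D[2 * n + z - 1][j] = 1    # third block: third coordinate
    return D


def is_matching(D, columns):
    """True if every row has weight one within the selected columns."""
    return all(sum(row[j] for j in columns) == 1 for row in D)


triples = [(1, 2, 2), (3, 2, 1), (2, 3, 1), (1, 2, 3), (2, 3, 3), (3, 1, 3)]
D = matching_incidence_matrix(triples, 3)
for row in D:
    print(row)
print(is_matching(D, [0, 2, 5]))
```

The columns `[0, 2, 5]` correspond to the triples $(1, 2, 2)$, $(2, 3, 1)$, and $(3, 1, 3)$ of the matching discussed above.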
Assume next that there exists a polynomial-time $\alpha$-approximation algorithm for the MinTrapAWGN-elem
problem. Construct $D$ for a given matching problem, set $b = 3n$, and run the MinTrapAWGN-elem algorithm
on $D$. If the algorithm finds an elementary trapping set, then it must have size $n$.
Consider the corresponding set of $n$ columns indexed by a set of $n$ triples from $T$. Each row in the submatrix
induced by the triples has weight one, which follows from the definition of an elementary trapping
set. Consequently, these triples represent a matching for $T$. This implies that no polynomial-time algorithm
for the MinTrapAWGN-elem problem exists, unless P = NP. □

4.4 Hardness of Approximation for MinTrapAWGN-maj 

First we prove a hardness of approximation result for the problem of finding the good set of minimum
cardinality. Recall that a set $S \subseteq L$ is good if the majority of nodes in $\Gamma(S)$ have even degree in $G_S$.
We call this problem MinGood. We will then use this to show a hardness of approximation result for
MinTrapAWGN-maj.

Our proof uses a reduction from MinCodeword. Let $H$ be the $n \times (n - k)$ parity-check matrix of some
code. We may assume that the code specified by $H$ includes at least one codeword in addition to the zero
vector. This gives rise to the graph $G' = (L' \cup R', E')$, where $L' = \{x_1, \ldots, x_n\}$, $R' = \{y_1, \ldots, y_m\}$, and
$E' = \{(x_i, y_j) : H_{i,j} = 1\}$. We will create a bipartite graph $G = (L \cup R, E)$ by augmenting $G'$ with graphical
objects termed "ZigZag"s and "OrGate"s. These graphical objects will ensure that the minimum cardinality
of a good set is approximately proportional to the minimum weight of any codeword.

4.4.1 The ZigZag 

For each $x \in L'$ we add a ZigZag($x$) structure. This structure consists of $3(m - 1)$ nodes, given by
$L(\mathrm{ZigZag}(x)) = \{v_1, \ldots, v_{m-1}\}$, $R(\mathrm{ZigZag}(x)) = \{u_1, \ldots, u_{m-1}, w_1, \ldots, w_{m-1}\}$, and edges

$$E(\mathrm{ZigZag}(x)) = \{(u_i, v_i), (v_i, w_i) : i \in [m - 1]\} \cup \{(v_i, w_{i+1}) : i \in [m - 2]\} \cup \{(x, w_1)\}.$$

The intuition behind the ZigZag($x$) structure is that if $x$ is in the trapping set then the nodes $L(\mathrm{ZigZag}(x))$
will also be in the trapping set. For a subgraph $G''$ of $G$ and $S \subseteq L$, we define

$$\mathrm{Disc}_S(G'') = |\{v \in \Gamma(S) \cap V(G'') : d_{G_S}(v) \text{ even}\}| - |\{v \in \Gamma(S) \cap V(G'') : d_{G_S}(v) \text{ odd}\}|.$$
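The quantity $\mathrm{Disc}_S$ is straightforward to compute. The following Python sketch evaluates it for a Tanner graph given as a 0/1 matrix; the function name `disc`, the optional `checks` argument used to restrict the count to the check nodes of a subgraph, and the example matrix are assumptions of this sketch.

```python
def disc(H, S, checks=None):
    """Number of even-degree minus odd-degree checks in Gamma(S).

    H: list of 0/1 rows (checks x variables); S: set of variable indices.
    checks: optional iterable of row indices restricting the count to the
    check nodes of a subgraph; all rows are used when it is None.
    """
    rows = range(len(H)) if checks is None else checks
    even = odd = 0
    for r in rows:
        d = sum(H[r][i] for i in S)
        if d == 0:
            continue  # check r is not in Gamma(S)
        if d % 2 == 0:
            even += 1
        else:
            odd += 1
    return even - odd


H = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
print(disc(H, {0, 1}), disc(H, {0, 1, 2}), disc(H, {0, 1}, checks=[0]))
```

A set $S$ for which the even-degree checks are in the majority is exactly one with positive discrepancy over the whole graph.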

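Lemma 1. For all $x \in S$, $\mathrm{Disc}_S(\mathrm{ZigZag}(x)) \le 0$, and $\mathrm{Disc}_S(\mathrm{ZigZag}(x)) = 0$ iff $\mathrm{ZigZag}(x) \cap L \subseteq S$.

Proof. Note that

$$|\{v \in \Gamma(S) \cap V(\mathrm{ZigZag}(x)) : d_{G_S}(v) \text{ odd}\}| \ge |S \cap V(\mathrm{ZigZag}(x))|$$

with equality iff $L \cap V(\mathrm{ZigZag}(x)) \subseteq S$, because each $v_i \in S$ is connected to a check node of degree 1. But for
any $S$,

$$|\{v \in \Gamma(S) \cap V(\mathrm{ZigZag}(x)) : d_{G_S}(v) \text{ even}\}| \le |S \cap V(\mathrm{ZigZag}(x))|$$

with equality iff $L \cap V(\mathrm{ZigZag}(x)) \subseteq S$. □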

4.4.2 The OrGate 

For each $y \in R'$ we add OrGate($y$), and let $\Gamma(y) \cap L' = \{u_1, \ldots, u_{k'}\}$. Let $k = 2^{\lceil \log_2 k' \rceil}$. The construction
OrGate($y$) consists of two node sets, $L(\mathrm{OrGate}(y))$ and $R(\mathrm{OrGate}(y))$. Consider a binary tree whose leaves are the nodes
$\{u_1, \ldots, u_k\}$, where $u_{k'+i} = u_{k'}$ for $i \in [k - k']$. Then $L(\mathrm{OrGate}(y))$ consists of nodes corresponding to the
internal nodes of the tree, i.e.,

$$L(\mathrm{OrGate}(y)) = \{v_{u_1 \vee u_2}, \ldots, v_{u_{k-1} \vee u_k}, v_{u_1 \vee u_2 \vee u_3 \vee u_4}, \ldots, v_{u_{k-3} \vee u_{k-2} \vee u_{k-1} \vee u_k}, \ldots, v_{u_1 \vee u_2 \vee \cdots \vee u_k}\}.$$

For each internal node $v$ with children $u$ and $w$, we add four new check nodes $C(v) := \{c_1(v), c_2(v), c_3(v), c_4(v)\}$:
all are connected to $v$, the first and third are connected to $u$, and the first and second are connected to $w$. If $v$
is the root of the tree, we also add one more new check node, which is connected only to $v$. We call this node
$z$. Let $R(\mathrm{OrGate}(y))$ be the set of such nodes, and let $E(\mathrm{OrGate}(y))$ be the set of such edges. Finally, let

$$f(S, y) = \{v_{u_i \vee \cdots \vee u_j} \in L(\mathrm{OrGate}(y)) : |S \cap \{u_i, \ldots, u_j\}| \ge 1\}.$$

Figure 5: Reduction from MinCodeword to MinTrapAWGN-maj. (a) OrGate($y$). (b) ZigZag($x$). (c) Ballast.

Lemma 2. For all $y \in G_S \cap R'$, $\mathrm{Disc}_S(\mathrm{OrGate}(y)) \le -1$, with equality if $S \cap L(\mathrm{OrGate}(y)) = f(S,y)$.

Proof. Consider the four check nodes $C(v)$ for some internal node $v$ of the tree used in the construction of OrGate($y$). Let $u, w$ be the children of $v$ in the original binary tree. Then, if any of $u, v, w$ is in $S$, $\mathrm{Disc}_S(C(v)) \le 0$, with equality iff $v \in S$ and at least one of $u, w$ is in $S$. Consequently, if $\Gamma(y) \cap L' \cap S \ne \emptyset$, then $\mathrm{Disc}_S(\bigcup_v C(v)) \le 0$, with equality iff $S \cap L(\mathrm{OrGate}(y)) = f(S,y)$. In particular, the root of the binary tree is in $S$ and therefore the final check node $z$ has odd degree. Therefore, if $\Gamma(y) \cap L' \cap S \ne \emptyset$, then counting $z$ as well gives $\mathrm{Disc}_S(\mathrm{OrGate}(y)) \le -1$, with equality if $S \cap L(\mathrm{OrGate}(y)) = f(S,y)$. □
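The case analysis for a single gadget $C(v)$ in the proof above is small enough to check exhaustively. The following Python sketch is an illustration only: the adjacency table is transcribed from the textual description of $C(v)$, and the names `CHECKS`, `disc`, and `gadget_table` are hypothetical helpers, not part of the original construction. It evaluates the discrepancy for every non-empty choice of $\{v, u, w\}$ placed in $S$.

```python
from itertools import product

# Variable-node neighbours of the four check nodes in C(v), read off the text:
# all four touch v; c1 and c3 touch u; c1 and c2 touch w.
CHECKS = {
    "c1": {"v", "u", "w"},
    "c2": {"v", "w"},
    "c3": {"v", "u"},
    "c4": {"v"},
}


def disc(selected):
    """Even-degree minus odd-degree check nodes among those adjacent to `selected`."""
    even = odd = 0
    for neighbours in CHECKS.values():
        degree = len(neighbours & selected)
        if degree == 0:
            continue  # this check node is not in Gamma(S)
        if degree % 2 == 0:
            even += 1
        else:
            odd += 1
    return even - odd


def gadget_table():
    """Disc value for every non-empty choice of {v, u, w} placed in S."""
    table = {}
    for bits in product([0, 1], repeat=3):
        selected = frozenset(name for name, bit in zip("vuw", bits) if bit)
        if selected:
            table[selected] = disc(selected)
    return table


if __name__ == "__main__":
    for selected, value in gadget_table().items():
        print(sorted(selected), value)
```

Under this transcription, the value is non-positive in all seven cases and reaches zero exactly when $v$ and at least one of its children are selected, which is the "or" behaviour the gadget is named after.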

4.4.3 Hardness of MinGood 

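Note that the graph $G$ that has been constructed has $|L| \le mn + 2n(n-k)$ and $|R| \le (n - m) + n^2$.

Lemma 3. $\mathrm{Disc}_S(G) \ge 0$ iff $S \cap L'$ is a codeword and for each $x \in S \cap L'$, $\mathrm{ZigZag}(x) \cap L \subseteq S$.

Proof. According to Lemmas 1 and 2,

$$\mathrm{Disc}_S(G) = \mathrm{Disc}_S(G') + \sum_{x \in L'} \mathrm{Disc}_S(\mathrm{ZigZag}(x)) + \sum_{y \in R'} \mathrm{Disc}_S(\mathrm{OrGate}(y)) \le \mathrm{Disc}_S(G') - \sum_{x \in L'} \mathbb{1}_{\mathrm{ZigZag}(x) \cap L \not\subseteq S} - |\Gamma(S) \cap R'|.$$

Note that $\mathrm{Disc}_S(G') \le |\Gamma(S) \cap R'|$, and therefore $\mathrm{Disc}_S(G) \ge 0$ implies that $d_{G_S}(y)$ is even for all $y \in R'$ and $\mathrm{ZigZag}(x) \subseteq G_S$ for all $x \in S \cap L'$. Conversely, again according to Lemmas 1 and 2, if $d_{G_S}(y) = 0 \bmod 2$ for all $y \in R'$ and $\mathrm{ZigZag}(x) \subseteq G_S$ for all $x \in S \cap L'$, then $\mathrm{Disc}_S(G) \ge 0$. □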

Theorem 6. For any constant $\alpha$, there is no polynomial-time $\alpha$-approximation algorithm for MinGood, unless RP = NP.

Figure 6: Combining the ZigZag, OrGate, and Ballast constructions.

Proof. Assume that $S$ is a good set such that $|S| \le \alpha\,\mathrm{MinGood}$ for some constant $\alpha$. By Lemma 3 and Lemma 2,

$$|S| = |S \cap L'|\, m + \sum_{y \in \Gamma(S \cap L')} |f(S,y)|,$$

and $S \cap L'$ corresponds to a codeword. But $\sum_{y \in \Gamma(S \cap L')} |f(S,y)| \le 2n(n-k)$, and so by setting $m$ sufficiently large we get a constant approximation for MinCodeword. But no such approximation exists unless RP = NP. □



4.4.4 Hardness of MinTrapAWGN-maj

To achieve the hardness result for MinTrapAWGN-maj we need to further augment our graph $G$ with multiple "Ballast" constructions. We call the resulting graph $G^+$. The intuition behind Ballast is that no nodes from Ballast will be chosen in $S$, while the multiple copies of Ballast will ensure that the complement of $S$ is also good. A single Ballast consists of nodes $L(\mathrm{Ballast}) = \{u_1, \ldots, u_l\}$, $R(\mathrm{Ballast}) = \{v_1, \ldots, v_l, w_2, \ldots, w_l\}$, and edges

$$E(\mathrm{Ballast}) = \{(u_i, v_i) : i \in [l]\} \cup \{(v_i, u_{i+1}) : i \in [l-1]\} \cup \{(v_l, u_1)\} \cup \{(u_i, w_i) : 1 < i \le l\}.$$

We consider setting $l = n|L|$ and adding $|R|$ copies of Ballast to $G$.

Lemma 4. $\mathrm{Disc}_S(\mathrm{Ballast}) \le 1$ with equality iff $L(\mathrm{Ballast}) \subseteq S$.

Proof. Let $A = S \cap \{u_1, \ldots, u_l\}$. Note that $\Gamma(A)$ contains at least $|A| - 1$ nodes with odd degree, with equality iff $L(\mathrm{Ballast}) \subseteq S$. $\Gamma(A)$ contains at most $|A|$ nodes of even degree, with equality iff $L(\mathrm{Ballast}) \subseteq S$. □
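Lemma 4 can be sanity-checked by brute force on small Ballast instances. The sketch below is illustrative and rests on one stated assumption: the pendant check nodes are $w_2, \ldots, w_l$, each attached to the matching $u_i$, which is the reading consistent with the node set $R(\mathrm{Ballast})$. The function names are hypothetical.

```python
from itertools import combinations


def ballast_checks(l):
    """Check node -> set of variable-node indices (u_1..u_l) for one Ballast copy."""
    checks = {}
    for i in range(1, l + 1):
        nxt = i + 1 if i < l else 1       # v_l closes the cycle back to u_1
        checks[f"v{i}"] = {i, nxt}
    for i in range(2, l + 1):             # pendant checks w_2..w_l (assumed range)
        checks[f"w{i}"] = {i}
    return checks


def disc(checks, selected):
    """Even-degree minus odd-degree check nodes among those adjacent to `selected`."""
    even = odd = 0
    for neighbours in checks.values():
        degree = len(neighbours & selected)
        if degree == 0:
            continue
        if degree % 2 == 0:
            even += 1
        else:
            odd += 1
    return even - odd


def best_subsets(l):
    """Return (max Disc, list of subsets of {1..l} attaining it)."""
    checks = ballast_checks(l)
    best, winners = None, []
    for size in range(l + 1):
        for subset in combinations(range(1, l + 1), size):
            value = disc(checks, set(subset))
            if best is None or value > best:
                best, winners = value, [subset]
            elif value == best:
                winners.append(subset)
    return best, winners
```

For the small values of $l$ tried in the accompanying checks, the maximum discrepancy is 1 and is attained only by the full set $\{u_1, \ldots, u_l\}$, in agreement with the lemma. The enumeration is exponential in $l$ and is meant only as a check, not as part of the reduction.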

Lemma 5. Assuming there exists a non-zero codeword, there is a good set in G. Furthermore, any good set 
in G is a trapping set for G + . 

Proof. Let $S'$ be the subset of $L'$ corresponding to the minimum weight codeword. Let

$$S = S' \cup \Big( \bigcup_{x \in S'} L(\mathrm{ZigZag}(x)) \Big) \cup \Big( \bigcup_{y \in \Gamma(S')} f(S', y) \Big).$$

Then $S$ is a good set in $G$. For the second part of the lemma, note that by Lemma 4, for $S \subseteq L$, $\mathrm{Disc}_{\bar{S}}(G^+) \ge |R| - |R| = 0$. □

Theorem 7. For any constant $\alpha$, there is no polynomial-time $\alpha$-approximation algorithm for MinTrapAWGN-maj, unless RP = NP.

Proof. Assume that $S$ is a trapping set such that $|S| \le \alpha\,\mathrm{MinTrapAWGN\text{-}maj}$ for some constant $\alpha$. By Lemma 5 we know that $|S| \le \alpha|L|$, and hence $S$ does not include all left hand side nodes of any copy of Ballast, because doing so would imply that $|S| \ge |L(\mathrm{Ballast})| = n|L|$. But then by Lemma 4 we may assume that no nodes from Ballast are included in $S$, because removing all such nodes from $S$ increases $\mathrm{Disc}_S(G)$. Consequently $S$ must be a subset of $L$. Since any good subset of $L$ is a trapping set, $\mathrm{MinGood}(G) = \mathrm{MinTrapAWGN\text{-}maj}(G^+)$. But, by Theorem 6, there is no constant approximation of MinGood. □

5 Hardness of Approximation Results for Sparse Codes 

The fact that a problem is NP-hard does not, in general, imply that it remains NP-hard when restricted to a special class of instances. Since iterative decoding algorithms offer both linear-time complexity and good decoding performance only for special classes of codes, it is important to establish the analogues of the results in Section 4 for such codes. We provide next a set of results establishing the hardness of approximating stopping and trapping sets for low-density parity-check (LDPC) codes.

LDPC codes are linear block codes for which the parity-check matrix $H$ is sparse, i.e., for which $H$ has a "small" number of non-zero entries. More formally, we define an LDPC code as a code with the property that each variable and check node in its Tanner graph $G = (L \cup R, E)$ has degree at most $\delta_v$ and $\delta_c$, respectively, for some constants $\delta_v, \delta_c \ge 2$ independent of $n$.

Theorem 8. There exists a constant $\alpha > 1$ such that it is NP-hard to $\alpha$-approximate MinStop in the Tanner graph of an LDPC code.

The proof follows along the same lines as the proof of NP-hardness using a reduction from the MinVertCov problem [57]: Let $G = (V, E)$ be an undirected graph, which, without loss of generality, can be assumed to be connected and of vertex degree bounded from above by three. Furthermore, assume that $|V| = n$, $|E| = m$, and that $E = \{e_1, \ldots, e_m\}$, $V = \{v_1, \ldots, v_n\}$. Without loss of generality, one can set $e_1 = (v_1, v_2) \in E$. A bipartite graph $G_{vc}$ is constructed as follows: the left hand side vertices of the graph consist of nodes $L = L_0 \cup L_1$, where $L_0 = V$ and $L_1 = \{e'_1, \ldots, e'_m\}$. The right hand side vertices of the graph consist of nodes $R = R_0 \cup R_1$, with $R_0 = E$ and $R_1 = \{z_1, \ldots, z_m\}$. The set of edges of $G_{vc}$ is a collection of ordered pairs of the following form:

$$\{(e_i \in R_0, u \in L_0), (e_i \in R_0, v \in L_0) : e_i = (u,v) \in E\} \cup \{(e_i \in R_0, e'_i \in L_1) : 1 \le i \le m\} \cup$$

$$\{(z_i \in R_1, e'_i \in L_1), (z_i \in R_1, e'_{i+1} \in L_1) : 1 \le i \le m-1\} \cup \{(z_m \in R_1, v_1 \in L_0), (z_m \in R_1, e'_1 \in L_1)\}.$$

It is straightforward to show that if $S$ is a stopping set in $G_{vc}$, then $S \cap L_0$ is a vertex cover in $G$ [27]. As a consequence, there exists a constant $\epsilon > 0$ such that there is no $(1 + \epsilon)$-approximation algorithm for the MinStop problem, unless P = NP.
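The relation between stopping sets of $G_{vc}$ and vertex covers of $G$ can be checked exhaustively on a toy graph. The Python sketch below is a hedged illustration: it builds $G_{vc}$ from the edge list as read from the construction above, uses the standard definition of a stopping set (a non-empty set of variable nodes such that no check node has exactly one neighbour in the set), and all function names are hypothetical.

```python
from itertools import combinations


def build_gvc(edges):
    """Check node -> set of variable-node neighbours in G_vc; edges[0] plays e_1 = (v_1, v_2)."""
    m = len(edges)
    v1 = edges[0][0]
    checks = {}
    for i, (a, b) in enumerate(edges, start=1):
        checks[("e", i)] = {("v", a), ("v", b), ("e'", i)}   # R_0: one check per edge
    for i in range(1, m):
        checks[("z", i)] = {("e'", i), ("e'", i + 1)}         # R_1: chain over L_1
    checks[("z", m)] = {("v", v1), ("e'", 1)}                 # z_m ties the chain to v_1
    return checks


def is_stopping_set(checks, selected):
    """Non-empty, and no check node sees exactly one selected variable node."""
    return bool(selected) and all(len(nb & selected) != 1 for nb in checks.values())


def is_vertex_cover(edges, cover):
    return all(a in cover or b in cover for a, b in edges)


def stopping_sets(vertices, edges):
    """All stopping sets of G_vc, found by exhaustive enumeration (toy sizes only)."""
    checks = build_gvc(edges)
    variables = [("v", v) for v in vertices]
    variables += [("e'", i) for i in range(1, len(edges) + 1)]
    found = []
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            if is_stopping_set(checks, set(subset)):
                found.append(set(subset))
    return found


def l0_part(stopping_set):
    """Original graph vertices (the L_0 part) contained in a stopping set."""
    return {name for kind, name in stopping_set if kind == "v"}
```

For instance, with `vertices = [1, 2, 3, 4]` and the 4-cycle `edges = [(1, 2), (2, 3), (3, 4), (4, 1)]`, one can confirm that `is_vertex_cover(edges, l0_part(s))` holds for every `s` returned by `stopping_sets(vertices, edges)`. The enumeration is exponential and serves only to illustrate the reduction on small inputs.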

Note that in the construction, each vertex in $L$ has degree bounded from above by four (the auxiliary variable nodes $e'_1, \ldots, e'_{|E|}$ have, by construction, degree two, while all vertices in $V$ other than $v_1$ and $v_2$ have degree at most three; the vertices $v_1$ and $v_2$ can have degree at most four). Similarly, the check nodes have maximum degree three, since by construction the vertices $z_1, \ldots, z_{|E|}$ have degree two, while the vertices in $R_0$ have degree three.

One can establish the even stronger result that the MinStop problem for LDPC codes remains NP-hard even for codes with Tanner graphs that avoid cycles of length four. This follows from the same arguments used in the proof of the theorem above, with an additional reference to the hardness of the MinSetCovInterOne problem, which also holds in the setting of sparse codes [23].

Theorem 9. There exists a constant $\alpha > 1$ such that it is NP-hard to $\alpha$-approximate MinTrapAWGN-elem in the Tanner graph of an LDPC code.

Proof. The proof follows along the same lines as the proof of Theorem 5, with the three-dimensional matching problem replaced by its constrained version involving a bounded number $\ell$ of appearances of each element in $X$. □

Theorem 10. The problems MaxLikeDecode and MinCodeword are NP-hard for LDPC codes. 
Proof. The proof is a direct consequence of the fact that the parity-check matrix used in the reduction from the MaxThreeDimMatch to the MaxLikeDecode problem is sparse (it has column weight three, and the row weight can be made bounded as well by invoking the constraint that any element of $X$ cannot appear more than $r \ge 3$ times). The claimed result follows from the observation that there exists a polynomial-time reduction algorithm from the MaxLikeDecode to the MinCodeword problem [37], [35]. □

As a consequence of the above finding, all trapping set problems described in Section 4 for which the hardness was established in terms of reductions from the MinCodeword problem remain NP-hard for the class of LDPC codes.

6 Estimation of the Error-Floor 

The error floor is a phenomenon inherent to iterative decoders that manifests itself as a sudden change in the slope of the BER performance of a code. Alternatively, it represents a phase transition in the dynamical system of the decoder that prohibits it from attaining a sufficiently low BER. The error floor usually appears at moderate to high signal-to-noise ratios, i.e., for small values of the erasure and error probability $p$ of the BEC and BSC. For such values of $p$, the codeword error-rate $R(p)$ has the form

$$\log(R(p)) \approx \log(N_\kappa) + \kappa \log(p), \qquad (1)$$

where $\kappa$ denotes the size of the smallest stopping/trapping sets, while $N_\kappa$ represents the number of such sets. The dominating term in the expression is the linear term $\kappa \log(p)$.
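As a small worked illustration of approximation (1), the sketch below evaluates its right-hand side in base 10. The function name and the sample values (three-element smallest sets, one hundred of them) are hypothetical and chosen only for the arithmetic; they are not taken from any particular code.

```python
import math


def log10_codeword_error_rate(kappa, n_kappa, p):
    """Right-hand side of (1) in base 10: log10(N_kappa) + kappa * log10(p)."""
    return math.log10(n_kappa) + kappa * math.log10(p)


if __name__ == "__main__":
    # Hypothetical values: smallest sets of size 3, one hundred such sets.
    for p in (1e-2, 1e-3, 1e-4):
        print(p, log10_codeword_error_rate(3, 100, p))
```

Each tenfold reduction of $p$ lowers the estimate by $\kappa$ decades, whereas changing $N_\kappa$ only shifts the curve by a constant. This is why the estimate is sensitive mainly to $\kappa$, the size of the smallest set.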

As a consequence of the preceding hardness results, we have the following.

Corollary 6. Unless P = NP, there is no polynomial time algorithm for estimating the error-floor of codes used over the BEC and BSC within an $O(1)$ term.

For the AWGN channel with noise variance $\sigma^2$, a heuristic formula for the codeword error-rate was derived in [31], where it was shown that

$$R(\sigma) \ge \sum_{T \in \mathcal{T}} P(T, \sigma),$$

where $\mathcal{T}$ denotes the set of dominant (small) elementary trapping sets for the given code, and $P(T, \sigma)$ is the probability of decoder failure on a trapping set $T$. It was observed that simulation of decoding can be viewed as a stochastic process for finding trapping sets [31]. This, and other methods that rely on combining simulation techniques with "aided flipping" methods and greedy search strategies, were all observed to be inefficient when estimating the error-floor of "good codes", i.e., codes with large minimum stopping and trapping set sizes. In the next section, we show that some problems discussed in the paper have complexity that grows exponentially with the size of the smallest set being sought, but only polynomially with respect to the size of the input (i.e., code length). Consequently, one can easily find the smallest stopping sets of fairly long codes, provided that the size of such stopping sets is not greater than 10 to 15 [13], [33]. This was observed in several papers, including [34].
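To make the dependence on the size of the sought set concrete, the following Python sketch performs a naive exhaustive search for a smallest stopping set in a parity-check matrix, trying candidate sizes in increasing order. It is an illustration under stated assumptions, not one of the algorithms from the cited references: the function name is hypothetical, the matrix is a list of 0/1 rows, and the search cost grows roughly like the number of subsets of the given size, so it is practical only when the smallest stopping set is small.

```python
from itertools import combinations


def smallest_stopping_set(H, max_size):
    """Return a smallest non-empty stopping set of size <= max_size, or None.

    H is a parity-check matrix given as a list of 0/1 rows. A set of columns is
    a stopping set if no row has exactly one 1 inside the set.
    """
    n = len(H[0])
    for size in range(1, max_size + 1):
        for cols in combinations(range(n), size):
            if all(sum(row[j] for j in cols) != 1 for row in H):
                return set(cols)
    return None


# A small example matrix, used only to exercise the function.
H_EXAMPLE = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

if __name__ == "__main__":
    print(smallest_stopping_set(H_EXAMPLE, max_size=7))
```

Because sizes are tried in increasing order, the first set returned is a smallest one. Note that this naive enumeration has an exponent that depends on the set size, so it does not by itself achieve the fixed-parameter running times discussed in the paper.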

7 Fixed-Parameter Tractability 

Parameterized complexity represents a measure of the computational cost of problems that have several input parameters. Problems for which one of the parameters, say $\kappa$, is fixed are called parameterized problems. There exist problems that require running time exponential in the parameter $\kappa$ but only polynomial in the input size. Hence, if $\kappa$ is fixed at a small value, such problems can still be solved exactly in an efficient manner. A parameterized problem that admits such an algorithm is termed a fixed-parameter tractable problem and belongs to the class FPT, first studied by Downey and Fellows [13].
Many NP-complete problems are fixed-parameter tractable. As an example, MinVertCov is FPT, with complexity $O(\kappa n + (4/3)^{\kappa} \kappa^2)$, where $\kappa$ denotes the size of the smallest vertex cover, and $n$ is the size of the input, i.e., the number of vertices in the graph. Despite the fact that MinVertCov is a special instance of MinHitSet with set sizes equal to two, the latter is not known to have FPT algorithms when parameterization is performed only with respect to the size $\kappa$ of the smallest hitting set. Strong evidence suggests that such an algorithm does not exist, since MinHitSet is W[2]-complete (for the non-trivial definition of the W[2] class, see [13]). It is only known that MinHitSet is FPT when the set sizes are bounded, and parameterization is performed with respect to, say, $\kappa + \delta_{\max}$, where $\delta_{\max}$ denotes the size of the largest set in the MinHitSet formulation.
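The flavour of a fixed-parameter algorithm can be seen in the textbook bounded search tree for vertex cover, sketched below in Python. This is the simple branching scheme whose search tree has at most $2^{\kappa}$ leaves, not the faster $(4/3)^{\kappa}$ algorithm quoted above; the function names are hypothetical.

```python
def vertex_cover_at_most(edges, k):
    """Return a vertex cover of size <= k as a set, or None if none exists.

    Branching rule: any cover must contain an endpoint of the first uncovered
    edge, so try each endpoint and recurse with budget k - 1.
    """
    if not edges:
        return set()
    if k == 0:
        return None
    u, v = edges[0]
    for pick in (u, v):
        remaining = [e for e in edges if pick not in e]
        sub = vertex_cover_at_most(remaining, k - 1)
        if sub is not None:
            return sub | {pick}
    return None


def min_vertex_cover(edges):
    """Smallest vertex cover, found by increasing the budget until one exists."""
    k = 0
    while True:
        cover = vertex_cover_at_most(edges, k)
        if cover is not None:
            return cover
        k += 1


if __name__ == "__main__":
    print(min_vertex_cover([(1, 2), (2, 3), (3, 4), (4, 1)]))
```

The recursion depth is bounded by the budget, and each level does work linear in the number of edges, so the exponential part of the running time depends only on the cover size and not on the size of the graph.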

In this section, we use the results of [5] and [33] to show that the MinCStop problem is FPT. Furthermore, by invoking the recent results in [11], we show that the problem of enumerating all cover stopping sets is also FPT.


Theorem 11. The problem MinCStop for LDPC codes of maximal constant check node degree $\delta_c$ is in FPT, with best known complexity bound of the form



The algorithm that achieves this bound is a tree search algorithm; see [16].

Theorem 12. The problem of enumerating all minimal cover stopping sets in LDPC codes of maximal constant check node degree $\delta_c$ is in FPT, with best known complexity bound of the form $O^*((\delta_c - 1 + o(1))^{\kappa})$, where $O^*$ refers to an $O(\cdot)$ function for which all polynomial factors are suppressed, and where $\kappa$ stands for the size of the smallest cover stopping set.

As a final remark, the problem MinStop can be shown to be W[1]-hard, due to its connection to the Exact Even Set problem [5].

8 Conclusion 

We showed that a class of problems pertaining to the size of the smallest stopping and trapping sets in Tanner graphs is NP-hard even to approximate. Furthermore, we showed that similar results apply to the class of LDPC codes. Our findings provide one of the few known families of codes for which the minimum distance and stopping set problems are NP-hard. We also showed that a simple instance of the stopping set problem for LDPC codes, namely the complete stopping set problem, is fixed-parameter tractable.